Final Report


The  original  focus  of  this  work  was  on  the  automatic  acquisition  (learning)  of  stochastic 
models.  The  motivation  was  the  lack  of  such  models  for  military  problems,  specifically 
air-campaign  planning,  and  the  existence  of  new  algorithms  that  could,  if  the  appropriate 
models were available, considerably improve the accuracy and efficiency of military planning.
This final report describes the course of our investigations, some unanticipated turns,
and  the  direction  that  our  research  has  taken  as  a  consequence  of  what  we  have  learned. 

In  recent  years,  we  developed  new  models  and  techniques  for  representing  stochastic 
processes [Dean and Kanazawa, 1988, Boutilier et al., 1995a] that enabled us to compactly
represent problems that couldn’t be represented at all using previous techniques. We had
also met with success in solving such problems using new methods that directly exploit the
structure in the representations [Boutilier et al., 1995b, Dean et al., 1995, Dean and Lin,
1995, Lin and Dean, 1994, Lin and Dean, 1996, Lin and Dean, 1995]. Our models achieved
efficiency  of  representation  by  factoring  the  state  and  action  spaces  of  a  dynamical  system 
using  a  set  of  features  (variously  called  “state  variables”  or  “fluents”).  For  example,  the 
state  space  for  an  air-campaign  planning  problem  would  have  state  variables  for  the  status 
of  each  target  and  the  location  of  each  aircraft. 
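
The saving from factoring can be illustrated with a minimal sketch. The value sets for target status and aircraft location below are hypothetical (the report names the kinds of state variables but not their values), and the function names are ours, chosen only for illustration.

```python
from itertools import product

# Hypothetical feature domains: the report names the kinds of state variables
# (target status, aircraft location) but not their possible values.
TARGET_STATUS = ("intact", "damaged", "destroyed")
AIRCRAFT_LOCATION = ("base", "en_route", "over_target")


def variable_domains(n_targets, n_aircraft):
    """Map each state variable (feature) to its domain of values."""
    domains = {f"target_{i}": TARGET_STATUS for i in range(n_targets)}
    domains.update({f"aircraft_{j}": AIRCRAFT_LOCATION for j in range(n_aircraft)})
    return domains


def flat_state_count(domains):
    """Number of states an explicit (flat) enumeration would need."""
    count = 1
    for values in domains.values():
        count *= len(values)
    return count


def factored_size(domains):
    """Number of (variable, value) pairs needed to describe the same space."""
    return sum(len(values) for values in domains.values())


def enumerate_states(domains):
    """Yield every joint state as a dict; feasible only for tiny problems."""
    names = list(domains)
    for values in product(*(domains[name] for name in names)):
        yield dict(zip(names, values))


small = variable_domains(n_targets=2, n_aircraft=1)
large = variable_domains(n_targets=10, n_aircraft=5)
print(flat_state_count(small), factored_size(small))
print(flat_state_count(large), factored_size(large))
```

With these assumed domains, ten targets and five aircraft give 3^15 = 14,348,907 joint states, while the factored description lists only 45 variable-value pairs. The flat count grows exponentially in the number of features and the factored description grows linearly.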

We  believed  when  we  wrote  the  proposal  for  this  grant  that  it  would  be  relatively 
straightforward  to  extend  methods  for  learning  hidden  Markov  models  [Rabiner  and  Juang, 
1986]  to  handle  our  factored  representations.  For  certain  specialized  problems,  researchers 
had  already  met  with  some  success  in  doing  exactly  this  [Ghahramani  and  Jordan,  1995]. 
However, in trying to carry out our research agenda,¹ we encountered two problems. First,
factored  models  have  much  more  structure  than  traditional  (flat)  hidden  Markov  models 
and  the  class  of  problems  we  were  particularly  interested  in  (highly  combinatoric)  was  not 
amenable  to  the  specialized  methods  in  the  literature.  Second,  in  many  cases,  even  if  you 
could  learn  the  models,  you  couldn’t  necessarily  use  the  resulting  representations  to  solve  the 
corresponding  decision  problems.  We  found  that  we  had  some  way  to  go  in  understanding 
the  structure  of  factorial  models  and  how  to  exploit  this  structure  computationally  before 
we  could  learn  such  models  effectively. 

¹ We explored a wide range of approaches during the first year and carried out extensive experiments.
A good deal of the material compiled during that first year is available at the Brown Computer Science
Dynamics web site: http://www.cs.brown.edu/research/ai/dynamics/.

Our first breakthrough came in 1997, when, in trying to understand the work of Boutilier
et al., we discovered how to characterize the structure their algorithm was taking advantage
of in terms of bisimulation equivalence and automata equivalence [Hartmanis and Stearns,
1966]. The result was a series of papers [Dean and Givan, 1997, Givan and Dean, 1997,
Dean et al., 1997] in which we were able to explain the sources of combinatorial leverage
in the structured methods of Boutilier et al. and others. We found that the structure was
due to certain symmetries in the dynamics that, in certain cases, could be exploited to
significantly reduce computation time. During the same period, we developed algorithms
that were able to realize these reductions in computation time.
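
To make the idea of exploiting symmetries concrete, the following is a minimal sketch of generic partition refinement on a small, explicitly enumerated Markov decision process. It is not the algorithm from the papers cited above, which operate on factored representations. The function name, the data layout (`P[s][a]` as a dictionary of next-state probabilities, `R[s]` as a reward) and the four-state example are all assumptions made for illustration. Two states end up in the same block when they have the same reward and, for every action, the same probability of moving into each block.

```python
from collections import defaultdict


def coarsest_bisimulation(states, actions, P, R, ndigits=9):
    """Return {state: block id} for the coarsest stochastic bisimulation.

    P[s][a] is a dict {next_state: probability}; R[s] is the reward in s.
    Probabilities are rounded to `ndigits` places before comparison.
    """
    # Initial partition: states with equal reward share a block.
    reward_ids = {}
    block_of = {s: reward_ids.setdefault(R[s], len(reward_ids)) for s in states}

    while True:
        signature = {}
        for s in states:
            sig = [block_of[s]]  # keeping the old block makes this a refinement
            for a in actions:
                mass = defaultdict(float)
                for t, p in P[s][a].items():
                    mass[block_of[t]] += p
                sig.append(tuple(sorted(
                    (b, round(m, ndigits)) for b, m in mass.items()
                    if round(m, ndigits) > 0
                )))
            signature[s] = tuple(sig)

        # States with identical signatures stay together; others are split.
        ids = {}
        refined = {s: ids.setdefault(signature[s], len(ids)) for s in states}
        if len(set(refined.values())) == len(set(block_of.values())):
            return refined  # no block was split, so the partition is stable
        block_of = refined


# Hypothetical example: s1 and s2 behave identically up to renaming.
STATES = ["s0", "s1", "s2", "goal"]
ACTIONS = ["go"]
P = {
    "s0": {"go": {"s1": 0.5, "s2": 0.5}},
    "s1": {"go": {"goal": 0.8, "s1": 0.2}},
    "s2": {"go": {"goal": 0.8, "s2": 0.2}},
    "goal": {"go": {"goal": 1.0}},
}
R = {"s0": 0.0, "s1": 0.0, "s2": 0.0, "goal": 1.0}

blocks = coarsest_bisimulation(STATES, ACTIONS, P, R)
print(blocks)
```

In this example the symmetric states `s1` and `s2` fall into one block, so a solver could work with three aggregate states instead of four. The sketch enumerates states explicitly, which is exactly what structured methods aim to avoid; it is meant only to show what the equivalence relation captures.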

We  also  found  other  sources  of  computational  leverage  that  were  not  accessible  to  these 
methods. In particular, we found sources of computational leverage in air-campaign planning
problems that current algorithms could not handle. This prompted us to consider the
sort  of  structure  arising  in  systems  that  can  be  decomposed  into  smaller,  weakly-coupled 
component  systems.  And,  in  1998,  we  described  a  type  of  structure  found  in  air-campaign 
planning problems and related logistics problems; we also developed approximation algorithms
that performed extremely well on such problems [Meuleau et al., 1998].

Following this unanticipated side journey, we are now returning to the problem of automatically
learning stochastic models from data. We now have a great deal more experience
in actually constructing (painstakingly by hand) models for air-campaign planning and related
problems. We also have a much better idea of what aspects of such problems are useful
to  represent  in  the  sense  that  they  have  an  impact  on  the  performance  of  decision-making 
algorithms  and  they  provide  computational  leverage  in  solving  these  highly  combinatoric 
problems.  In  recent  months,  we  discovered  a  method  for  symbolically  solving  a  system 
of  equations  of  the  form  found  in  factored  Markov  decision  processes.  We  also  developed 
two  structured  iterative  methods  based  on,  respectively,  conjugate  gradient  search  and  an 
acceleration  method  attributed  to  Chebyshev.  These  methods  are  of  note  particularly  for 
the fact that they enable us to bring to bear a large body of work on numerical methods for
solving  systems  of  equations,  assuming  of  course  that  we  can  figure  out  how  to  factor  the 
equations. 
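
As an illustration of the kind of system involved, and assuming the standard setting of evaluating a fixed policy in a discounted Markov decision process (the report does not spell out the equations, and the symbols below are ours), the value function satisfies a linear system:

```latex
% Assumptions (not from the report): fixed policy \pi, discount factor
% 0 \le \gamma < 1, row-stochastic transition matrix P_\pi, reward vector r_\pi.
\begin{align}
  v_\pi &= r_\pi + \gamma P_\pi v_\pi \\
  (I - \gamma P_\pi)\, v_\pi &= r_\pi
\end{align}
```

Because the spectral radius of P_\pi is 1, the matrix I - \gamma P_\pi is nonsingular for \gamma < 1 and the solution is unique. Conjugate gradient in its basic form applies to symmetric positive-definite systems, and I - \gamma P_\pi is in general not symmetric, so some symmetrized form (for example, the normal equations) would typically be needed; the report does not say which form was used.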

We  are  currently  working  on  “compiler”  technology  that  will  work  in  concert  with 
learning  algorithms  to  explore  the  space  of  tractable  models,  rather  than  the  much  larger 
space  of  all  dynamical  models,  many  of  which  would  do  us  no  good  even  if  we  were  to  learn 
them.  This  compiler  technology  would  enable  us  to  identify  and  exploit  the  structure  due 
to  symmetries  in  the  dynamics  arising  from  (stochastic)  bisimulation  equivalence  [Dean  and 
Givan, 1997] and due to weakly-coupled subprocesses [Meuleau et al., 1998]. We are the
first to admit that this work is not traditional AI, but we are making significant progress
and  our  approaches  and  methodology  have  been  adopted  by  a  number  of  labs. 